GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to [...] such polynucleotides, the use of such polynucleotides and
polypeptides, as well as [...]. The polynucleotides and polypeptides of the present
invention have been putatively identified as glucosidases, α-galactosidases,
β-galactosidases, β-mannosidases, β-mannanases, endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of enzymes:
(i) phospho-β-galactosidases (EC 3.2.1.85) are specific for a phosphorylated substrate
generated via phosphoenolpyruvate [...]; (ii) [...] typically [...] relatively specific
for β-galactosides; and (iii) β-glucosidases (EC [...]) such as the enzymes of [...]
([...] Purification and characterization of a β-glucosidase from Alcaligenes faecalis,
Can. J. Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem.,
213, 305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase
purified from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W.
(1991) [...]). [...] the latter group, although highly specific [...] β-glycosidic
linkage, often display a rather relaxed substrate specificity and hydrolyze β-glucosides
as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and catalyze the
cleavage of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-Mannanase
hydrolyzes the mannose backbone internally, β-mannosidase hydrolyzes non-reducing,
terminal mannose residues, and α-galactosidase hydrolyzes α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety 
of applications. Guar is commonly used as a thickening agent in food and is utilized in 
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in the extreme conditions
associated with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as the
beet sugar industry. Sucrose from sugar beets accounts for 20-30% of domestic U.S.
sucrose consumption. Raw beet sugar can contain a small amount of raffinose when the
sugar beets are stored before processing and rotting begins to set in. Raffinose inhibits
the crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus,
there is merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also
been used as a digestive aid to break down raffinose, stachyose, and verbascose in foods
such as beans and other gassy foods.

β-Galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
genes reported encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) Thermotoga maritima sp. nov. represents a new genus of unique
extremely thermophilic eubacteria growing up to 90°C, Arch. Microbiol. 144, 324-333).
The gene products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The
enzyme hydrolyzes α-1,6-glucosidic linkages in these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the 
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and 
chemicals. Endoglucanases also have applications in detergent formulations, the textile 
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for 
the clarification and extraction of juices. 

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378
automated DNA sequencer (Applied Biosystems, Inc.) for all sequences of the present
invention.

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of OC1/4V-33B/G. 

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino
acid sequence of F1-12G. 

Figures 4a-b are the full-length DNA and corresponding deduced amino acid 
sequence of 9N2-31B/G. 

Figures 5a-b are the full-length DNA and corresponding deduced amino acid 
sequence of MSB8-6G. 

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence 
of AEDII12RA-18B/G. 

Figures 7a-b are the full-length DNA and corresponding deduced amino acid 
sequence of GC74-22G. 

Figures 8a-b are the full-length DNA and corresponding deduced amino acid 
sequence of VC1-7G1.

Figures 9a-c are the full-length DNA and corresponding deduced amino acid
sequence of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid 
sequence of 63GB1. 

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid
sequence of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid
sequence of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION 

In a preferred embodiment of the present invention, there are provided isolated
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64).

In another embodiment, the invention provides a method for producing a
polypeptide, including culturing host cells containing the polynucleotide of Figures 1-18,
expressing from the host cell a polypeptide encoded by the polynucleotide, and isolating
the polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose
from soluble cell oligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28 and 61-64,
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to 
the filing date of the present application. Nothing herein is to be construed as an 
admission that the invention is not entitled to antedate such disclosure by virtue of prior 
invention. 

Definitions 

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or 
ketone unit. 

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many
monosaccharide units.

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA 
polymerase will transcribe the two coding sequences into a single mRNA, which is then 
translated into a single polypeptide having amino acids derived from both coding 
sequences. The coding sequences need not be contiguous to one another so long as the 
expressed sequences ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by recombinant DNA
techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme is a DNA sequence which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel 
enzymes, as well as active fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present invention, there are provided
isolated nucleic acid molecules encoding the enzymes of the present invention, including
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the present invention, there is provided
a process for producing such polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence
of the present invention, under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the food processing industry, in
the pharmaceutical industry (for example, to treat intolerance to lactose), as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice industry, in baking, in the
textile industry and in the detergent industry.

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose residues. Furthermore,
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a thickening
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery.
Consequently, mannanases are industrially relevant for the degradation and modification
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
the extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to 
specifically hybridize to a nucleic acid sequence of the present invention. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in 
vitro purposes related to scientific research, for example, to generate probes for identifying 
similar sequences which might encode similar enzymes from other organisms by using 
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled 
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in 
Yellowstone National Park. The organism grows optimally at 85-88 °C, pH 7.0 in a low salt 
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase. 

OC1/4V is from the genus Thermotoga. The organism was isolated from 
Yellowstone National Park. It grows optimally at 75 °C in a low salt medium with cellulose 
as a substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (including clone 7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 (including clones 6GP2 and 6GB4) is from the
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85°C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus. 
AEDII12RA grows optimally at 85°C, pH 9.5 in a high salt medium (marine) containing 
polysulfides and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85°C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and
SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1"
(Figure 9 and SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24),
"6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ ID
NOS:12 and 26), "OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3" (Figure 14
and SEQ ID NO:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and 61), "MSB8-6GB4"
(Figure 16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and SEQ ID NOS:59
and 63), and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).
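
The pairing in the list above follows a regular pattern: polynucleotide SEQ ID NOS: 1-14
appear to correspond to enzyme SEQ ID NOS: 15-28, and SEQ ID NOS: 57-60 to SEQ ID
NOS: 61-64. The short Python sketch below only illustrates that bookkeeping. The function
name is hypothetical, and pairing SEQ ID NO:14 with SEQ ID NO:28 (clone 6GP3) is an
assumption, since the list names only SEQ ID NO:28 for that clone.

```python
def protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the enzyme SEQ ID NO paired with a polynucleotide SEQ ID NO.

    Hypothetical helper: the offsets are read off the clone list in the text.
    """
    if 1 <= nucleotide_seq_id <= 14:
        # SEQ ID NOS: 1-14 pair with 15-28 (14 -> 28 is assumed, see lead-in).
        return nucleotide_seq_id + 14
    if 57 <= nucleotide_seq_id <= 60:
        # SEQ ID NOS: 57-60 pair with 61-64.
        return nucleotide_seq_id + 4
    raise ValueError(f"SEQ ID NO:{nucleotide_seq_id} is not a listed polynucleotide")
```

For example, `protein_seq_id(2)` gives 16, matching the entry for OC1/4V-33B/G.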

The polynucleotides and polypeptides of the present invention show identity at the
nucleotide and protein level to known genes and proteins encoded thereby as shown in
Table 1.

Table 1

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
M11TL-29G | Sulfolobus solfataricus DSM 1616/P1, β-galactosidase | [illegible] | [illegible]
OC1/4V-33B/G | Caldocellum saccharolyticum, β-glucosidase | 52% | 57%
Staphylothermus marinus F1-12G | Bacillus polymyxa, β-[...] | 36% | 48%
Thermococcus 9N2-31B/G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 51% | 50%
Thermotoga maritima MSB8-6G | Clostridium thermocellum bglB | 45% | 53%
Thermococcus AEDII12RA-18B/G | Bacillus polymyxa, β-galactosidase | 34% | 48%
Thermococcus chitonophagus GC74-22G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 46% | 54%
Pyrococcus furiosus VC1-7G1 | Sulfolobus solfataricus/MT-4, β-galactosidase | 46.4% | 52.5%
Thermotoga maritima (6GC2) | Pediococcus pentosaceus, α-galactosidase | 49% | 29%
Thermotoga maritima β-mannanase (6GP2) | Aspergillus aculeatus, mannanase | 56% | 37%
AEPII 1a β-mannosidase (63GB1) | Sulfolobus solfataricus, β-galactosidase | 78% | 56%
OC1/4V endoglucanase (33GP1) | Clostridium thermocellum, endo-1,4-β-endoglucanase | 65% | 43%
Thermotoga maritima pullulanase (6GP3) | Caldocellum saccharolyticum, α-dextran 6-glucanohydrolase | 72% | 53%
Bankia gouldi mix endoglucanase (37GP1) | None available | |

The polynucleotides and enzymes of the present invention show homology to each 
other as shown in Table 2. 

Table 2

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
Staphylothermus marinus F1-12G | Thermococcus AEDII12RA-18B/G, β-galactosidase/glucosidase | 55% | 57%
Thermococcus 9N2-31B/G | Thermococcus chitonophagus GC74-22G, glucosidase | 74% | 66%
Pyrococcus furiosus VC1-7G1 | Pyrococcus furiosus VC1-7B/G, β-galactosidase | 46.4% | 54%

All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing,
under conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and
57-60; or (ii) they are DNA sequences which are degenerate to the polynucleotides of SEQ
ID NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding sequences. As
used herein, substantially similar refers to sequences having similar identity to the
sequences of the instant invention. The nucleotide sequences that are substantially the same
can be identified by hybridization or by sequence comparison. Enzyme sequences that are
substantially the same can be identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.
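
To make the notion of degenerate sequences concrete, the following Python sketch checks
whether two different coding sequences translate to the same amino acid sequence under
the standard genetic code. It is an illustration only: the function names are
hypothetical, and the two short sequences are invented examples, not taken from any
SEQ ID NO of this document.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(coding_dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) codon by codon."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate(dna_a: str, dna_b: str) -> bool:
    """True if the sequences differ at the nucleotide level but encode the same protein."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Invented example: both sequences encode Met-Ala-Lys.
example_a = "ATGGCTAAA"
example_b = "ATGGCCAAG"
```

Here `are_degenerate(example_a, example_b)` is true because GCT/GCC both encode alanine
and AAA/AAG both encode lysine.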

One means for isolating the nucleic acid molecules encoding the enzymes of the
present invention is to probe a gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (eds.), Greene Publishing Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides
of SEQ ID NOS: 1-14 and 57-60, or fragments thereof (comprising at least 12 contiguous
nucleotides), are particularly useful probes. Other particularly useful probes for this
purpose are fragments hybridizable to the sequences of SEQ ID NOS: 1-14 and 57-60 (i.e.,
comprising at least 12 contiguous nucleotides).
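
As an illustration of what "at least 12 contiguous nucleotides" means in practice, the
sketch below enumerates every contiguous fragment of a given length from a sequence. The
function name and the default length argument are hypothetical choices for this example;
the text does not prescribe any particular way of selecting probe fragments.

```python
def contiguous_fragments(sequence: str, length: int = 12) -> list[str]:
    """Return all contiguous sub-sequences of the given length, in order.

    A sequence shorter than `length` yields an empty list.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
```

A 15-nucleotide sequence therefore yields four candidate 12-nucleotide fragments.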

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 x 10^7 cpm (specific activity 4-9 x 10^8
cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution. After
12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in
1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing
0.5% SDS, followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the
oligonucleotide probe. The membrane is then exposed to auto-radiographic film for
detection of hybridization signals.
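
Reading the second wash temperature as 10°C below the melting temperature (Tm) of the
oligonucleotide probe, the wash temperature can be estimated once a Tm is available. The
text does not say how Tm is obtained; the sketch below assumes the common Wallace
rule of thumb for short oligonucleotides, Tm = 2(A+T) + 4(G+C) in °C, which is a
substitution made here purely for illustration. Function names are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Estimate Tm (deg C) of a short oligonucleotide with the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C); the source does not specify a Tm formula.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq) or not seq:
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_wash_temperature(oligo: str, offset_c: float = 10.0) -> float:
    """Wash temperature taken as Tm minus an offset (10 deg C in the example protocol)."""
    return wallace_tm(oligo) - offset_c
```

Under this assumption a 16-mer with eight A/T and eight G/C bases has an estimated Tm of
48°C and would be washed at 38°C.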

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed.,
Cold Spring Harbor Laboratory (1989), which is hereby incorporated by reference in its
entirety. Also, it is understood that a fragment of a 100 bps sequence that is 95 bps in
length has 95% identity with the 100 bps sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least 70%, preferably at least
80%, and more preferably at least 90% identical to another DNA (RNA) sequence if there
is at least 70%, 80% or 90% identity, respectively, between the bases of the first sequence
and the bases of the other sequence, when properly aligned with each other, for example
when aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14 
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference 
polynucleotide, has polynucleotide bases which are identical in 90% of the bases which 
make up the reference polynucleotide and may have different bases in 10% of the bases 
which comprise that polynucleotide sequence. 
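
The definition above can be expressed as a small calculation over two sequences that have
already been aligned (for example by BLASTN). The sketch below is illustrative only: it
assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and
it counts identical bases as a percentage of the bases making up the reference. The
function name is hypothetical.

```python
def percent_identity(aligned_reference: str, aligned_query: str) -> float:
    """Percent of reference bases matched by identical bases in the query.

    Both arguments are rows of one pairwise alignment ('-' marks a gap).
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    reference_bases = sum(1 for r in aligned_reference if r != "-")
    if reference_bases == 0:
        raise ValueError("reference contains no bases")
    matches = sum(
        1
        for r, q in zip(aligned_reference.upper(), aligned_query.upper())
        if r != "-" and r == q
    )
    return 100.0 * matches / reference_bases


# A 95-base fragment aligned against the 100-base sequence it was taken from.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
```

With these inputs `percent_identity(reference, fragment)` evaluates to 95.0, in line with
the 95 bps fragment example given in the text.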

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example changes that do not
alter the amino acid sequence encoded by the polynucleotide. The present invention also relates
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions 
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred 
aspect of the invention these polypeptides retain the same biological action as the 
polypeptide encoded by the reference polynucleotide. 

It is also appreciated that such probes can be and are preferably labeled with an 
analytically detectable reagent to facilitate identification of the probe. Useful reagents 
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of 
catalyzing the formation of a detectable product. The probes are thus useful to isolate 
complementary copies of DNA from other sources or to screen such sources for related 
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries 
from the organisms listed in Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be 
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries 
are thus generated and excisions performed according to the protocols/methods hereinafter 
described. 

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and
repurified. The polynucleotides of the present invention, and the enzymes encoded
thereby, yield the activities described above.

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified using buffer Z (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic Chemicals
Limited or Sigma) after introducing an excision library into the E. coli strain BW14893
F'kan1A. Expression clones encoding XGLUases were identified and repurified from
M11TL, OC1/4V, Pyrococcus furiosus VC1, Staphylothermus marinus F1, Thermococcus
9N-2, Thermotoga maritima MSB8, Thermococcus alcaliphilus AEDII12RA, and
Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445.)

per liter:

Na2HPO4·7H2O          16.1 g
NaH2PO4·7H2O          5.5 g
KCl                   0.75 g
MgSO4·7H2O            0.246 g
β-mercaptoethanol     2.7 ml
Adjust pH to 7.0
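
Because the recipe is given per liter, smaller batches are a matter of linear scaling. The
Python sketch below shows one way to do this. It is illustrative only: the dictionary keys
and function name are hypothetical, and the amounts are as read from the recipe above
(the 5.5 g value for the sodium dihydrogen phosphate and the decimal points in the KCl and
MgSO4 amounts are readings of a partly illegible source and should be checked against
Miller's manual).

```python
# Amounts per liter of Z-buffer as read from the recipe (grams unless noted).
Z_BUFFER_PER_LITER = {
    "Na2HPO4 heptahydrate (g)": 16.1,
    "NaH2PO4 hydrate (g)": 5.5,       # assumed reading, see lead-in
    "KCl (g)": 0.75,
    "MgSO4 heptahydrate (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(per_liter: dict[str, float], volume_ml: float) -> dict[str, float]:
    """Scale a per-liter recipe linearly to the requested volume in milliliters."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    factor = volume_ml / 1000.0
    return {component: amount * factor for component, amount in per_liter.items()}
```

For a 250 ml batch, `scale_recipe(Z_BUFFER_PER_LITER, 250)` gives one quarter of each
listed amount; the pH is still adjusted to 7.0 after mixing.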

High Temperature Filter Assay
(1) The F' factor F'kan (from E. coli strain CSH118)(1) was introduced into the pho- pnh-
lac- strain BW14893(2). The filamentous phage library was plated on the resulting
strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in Bacterial Genetics;
Lee, K.S., Metcalf, et al. (1992) Evidence for two phosphonate degradative
pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510.)
After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
membrane filters.

The colonies transferred to the filters were lysed with chloroform vapor in 150 mm 
glass petri dishes. 

The filters were transferred to 100 mm glass petri dishes containing a piece of 
Whatman 3 MM filter paper saturated with buffer. 

(a) when testing for galactosidase activity (XGALase), 3MM paper was
saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After the filter bearing lysed colonies was transferred to the
glass petri dish, the dish was placed in an oven at 80-85°C.

(b) when testing for glucosidase activity (XGLUase), 3MM paper was saturated
with Z buffer containing 1 mg/ml XGLU. After the filter bearing lysed
colonies was transferred to the glass petri dish, the dish was placed in an oven
at 80-85°C.

'Positives' were observed as blue spots on the filter membranes. The following
filter rescue technique was used to retrieve plasmid from a lysed positive colony. A
Pasteur pipette (or glass capillary tube) was used to core blue spots on the filter
membrane. The small filter disk was placed in an Eppendorf tube containing 20 µl
water. The Eppendorf tube was incubated at 75°C for 5 minutes, followed by
vortexing to elute plasmid DNA off the filter. This DNA was transformed into
electrocompetent E. coli DH10B cells for Thermotoga maritima MSB8-6G,
Staphylothermus marinus F1-12G, Thermococcus AEDII12RA-18B/G, Thermococcus
chitonophagus GC74-22G, M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A
E. coli were used for Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1.
The filter-lift assay was repeated on transformation plates to identify 'positives'.
The transformation plates were returned to a 37°C incubator after the filter lift to
regenerate colonies. 3 ml of LB liquid containing 100 µg/ml ampicillin was inoculated
with repurified positives and incubated at 37°C overnight. Plasmid DNA was isolated
from these cultures and the plasmid insert was sequenced.
In some instances where the plates used for the initial colony lifts contained non-
confluent colonies, a specific colony corresponding to a blue spot on the filter could
be identified on a regenerated plate and repurified directly, instead of using the filter
rescue technique.

Another example of such an assay is a variation of the high temperature filter assay 
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C 
for 20 minutes) to monitor thermostability. The 3MM paper is saturated with different
buffers (e.g., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcpβNp is used as an
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer
equipped with a thermostatted cuvette holder. The assays may be performed at 80°C or
90°C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561
mM^-1 cm^-1). The reaction mixture is allowed to reach the desired temperature, after
which the reaction is started by injecting an appropriate amount of enzyme (1.06 ml final
volume).

1 U β-glucosidase activity is defined as that amount required to catalyze the
formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate.
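
The unit definition can be applied to a measured absorbance slope through the
Beer-Lambert relation. The sketch below is a minimal illustration under stated
assumptions: the extinction coefficient is read as 0.561 mM^-1 cm^-1, the unit as
1.0 µmol pNp per minute, and the cuvette path length as 1 cm (the path length is not
given in the text). Function and parameter names are hypothetical.

```python
def glucosidase_units(
    delta_a405_per_min: float,
    reaction_volume_ml: float = 1.06,          # final volume given in the text
    extinction_mm_cm: float = 0.561,           # mM^-1 cm^-1, as read from the text
    path_length_cm: float = 1.0,               # assumed standard cuvette
) -> float:
    """Convert an absorbance slope (A405 per minute) into enzyme units (umol pNp/min)."""
    if extinction_mm_cm <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer-Lambert: concentration change in mM/min, which equals umol per ml per min.
    rate_mm_per_min = delta_a405_per_min / (extinction_mm_cm * path_length_cm)
    # Multiply by the reaction volume (ml) to get umol/min, i.e. units in the cuvette.
    return rate_mm_per_min * reaction_volume_ml
```

Under these assumptions an absorbance increase of 0.561 per minute corresponds to
1 mM pNp per minute, or 1.06 U in the 1.06 ml reaction.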

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A 
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular 
Genetics, the contents of which are hereby incorporated by reference in their entirety. 

A quantitative fluorometric assay for β-galactosidase specific activity is described 
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917 
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical 
Approach (ed. K. Hardy) pp. 79-103. IRL Press, Oxford. A description of the procedure can 
be found in Miller (1992) p. 75-77, the contents of which are incorporated by reference 
herein in their entirety. 

The polynucleotides of the present invention may be in the form of DNA, which 
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequences which encode the mature enzymes may be 
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or 
may be a different coding sequence which coding sequence, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of 
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
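
The degeneracy of the genetic code mentioned above can be illustrated with a small sketch. The two DNA strings below are hypothetical and the codon table is deliberately partial (only the codons used here); the point is simply that different coding sequences can encode the same amino acid sequence.

```python
# Partial codon table: covers only the codons used in this illustration.
CODONS = {
    "ATG": "M",
    "GTG": "V", "GTT": "V",
    "AAT": "N", "AAC": "N",
    "GCT": "A", "GCC": "A",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences that differ at three wobble positions.
seq_a = "ATGGTGAATGCT"
seq_b = "ATGGTTAACGCC"

if __name__ == "__main__":
    print(translate(seq_a), translate(seq_b), seq_a == seq_b)
```

Both strings translate to the same four residues even though the nucleotide sequences differ, which is the sense in which a "different coding sequence" may encode the same mature enzyme.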

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence 
such as a leader sequence or a proprotein sequence; the coding sequence for the mature 
enzyme (and optionally additional coding sequence) and non-coding sequence, such as 
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having 
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ 
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a 
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

WO 98/24799    PCT/US97/22623 

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises 
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 
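
As a rough illustration of what "a sequence complementary to that of the gene" means in practice, the sketch below computes a reverse complement and locates where a probe would pair perfectly with one strand of a target. The sequences and function names are hypothetical, and real hybridization depends on temperature, salt and mismatch tolerance, none of which is modeled here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_positions(probe: str, target: str) -> list:
    """0-based positions on the target strand where the probe pairs perfectly."""
    site = reverse_complement(probe).upper()
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    target = "TTGACCATGGTGAATGCTATGATTGTCAA"   # hypothetical library insert
    probe = "CATAGCATTCACCAT"                  # hypothetical labeled probe
    print(annealing_positions(probe, target))
```

A library member for which the list is non-empty would be a candidate for the probe in this simplified, exact-match model.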

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more 
preferably at least 95% identity between the sequences. The present invention particularly 
relates to polynucleotides which hybridize under stringent conditions to the hereinabove- 
described polynucleotides. As herein used, the term "stringent conditions" means 
hybridization will occur only if there is at least 95% and preferably at least 97% identity 
between the sequences. The polynucleotides which hybridize to the hereinabove described 
polynucleotides in a preferred embodiment encode enzymes which retain 
substantially the same biological function or activity as the mature enzyme encoded by the 
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed 
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for 
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof which fragments have at least 15 bases, preferably at least 30 bases and 
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 
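
The identity thresholds recited above (70%, 90%, 95%) can be made concrete with a small calculation. This is a minimal sketch under a simplifying assumption: the two sequences are already aligned without gaps and have equal length. The function names are hypothetical, and computed identity is only a proxy for whether two molecules actually hybridize under given conditions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over a gap-free alignment of two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 70% / 90% / 95% thresholds in the text."""
    if pct >= 95.0:
        return "at least 95%"
    if pct >= 90.0:
        return "at least 90%"
    if pct >= 70.0:
        return "at least 70%"
    return "below 70%"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # hypothetical sequences
    print(pct, identity_tier(pct))
```

For sequences of different length or with insertions and deletions, a proper alignment step would be needed first; that is outside the scope of this sketch.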

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzyme. 

The terms "fragment," "derivative" and "analog" when referring to the enzymes of 
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially the 
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature 
enzyme, such as a leader or secretory sequence or a sequence which is employed for 
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives 
and analogs are deemed to be within the scope of those skilled in the art from the teachings 
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but 
the same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or 
such polynucleotides or enzymes could be part of a composition, and still be isolated in that 
such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70% 
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and 61- 
64 and more preferably at least 90% similarity (more preferably at least 90% identity) to 
the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95% 
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS: 15- 
28 and 61-64 and also include portions of such enzymes with such portion of the enzyme 
generally containing at least 30 amino acids and more preferably at least 50 amino acids. 

As known in the art "similarity" between two enzymes is determined by comparing 
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala, 
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic 
residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of 
the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. 
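
The groupings listed above lend themselves to a simple lookup. The sketch below encodes the six groups named in the text (aliphatic, hydroxyl, acidic, amide, basic, aromatic) using one-letter codes; the function name is hypothetical, and other classification schemes exist that group residues differently.

```python
# One-letter codes for the groups named in the text.
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},   # aliphatic: Ala, Val, Leu, Ile
    {"S", "T"},             # hydroxyl: Ser, Thr
    {"D", "E"},             # acidic: Asp, Glu
    {"N", "Q"},             # amide: Asn, Gln
    {"K", "R"},             # basic: Lys, Arg
    {"F", "Y"},             # aromatic: Phe, Tyr
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if swapping one residue for the other stays within a listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("K", "D")]:
        print(pair, is_conservative(*pair))
```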

Most highly preferred are variants which retain the same biological function and 
activity as the reference polypeptide from which they vary. 

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as 
long as it is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli 
lac or trp, the phage lambda PL promoter and other promoters known to control expression 
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli. 

The vector containing the appropriate DNA sequence as hereinabove described, as 
well as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells 
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes 
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed 
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs 
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, 
a promoter, operably linked to the sequence. Large numbers of suitable vectors and 
promoters are known to those of skill in the art, and are commercially available. The 
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A 
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: 
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). 
However, any other plasmid or vector may be used as long as they are replicable and viable 
in the host. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the 
above-described constructs. The host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or 
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory 
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is 
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually about from 10 to 300 bp that act on a promoter to 
increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance 
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly- 
expressed gene to direct transcription of a downstream structural sequence. Such promoters 
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate 
kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The 
heterologous structural sequence is assembled in appropriate phase with translation 
initiation and termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion 
enzyme including an N-terminal identification peptide imparting desired characteristics, 
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural 
DNA sequence encoding a desired protein together with suitable translation initiation and 
termination signals in operable reading phase with a functional promoter. The vector will 
comprise one or more phenotypic selectable markers and an origin of replication to ensure 
maintenance of the vector and to, if desirable, provide amplification within the host. 
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella 
typhimurium and various species within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from 
commercially available plasmids comprising genetic elements of the well known cloning 
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa 
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, 
a suitable promoter and enhancer, and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, 
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice 
and polyadenylation sites may be used to provide the required nontranscribed genetic 
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a 
product of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and 
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may be 
non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-Galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the 
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the 
food processing industry for the production of low lactose content milk and for the 
production of galactose or glucose from lactose contained in whey obtained in a large 
amount as a by-product in the production of cheese. Generally, it is desired that enzymes 
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated 
temperatures to help prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical industry. The enzymes 
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they 
are employed as reporter molecules. 

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give 
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal 
linkages yielding cellobiose, glucose and cellooligosaccharides as products). 
β-Glucosidases are used in applications where glucose is the desired product. Accordingly, 
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and 
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including 
in corn wet milling for the separation of starch and gluten, in the fruit industry for 
clarification and equipment maintenance, in baking for viscosity reduction, in the textile 
industry for the processing of blue jeans, and in the detergent industry as an additive. For 
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained 
will then bind the enzymes themselves. In this manner, even a sequence encoding only a 
fragment of the enzymes can be used to generate antibodies binding the whole native 
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that 
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art; for example, one such screening assay is described in 
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol. 160, pp. 87-116, 
which is hereby incorporated by reference in its entirety. 

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital 
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme 
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid 
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in 
a larger volume. Appropriate buffers and substrate amounts for particular restriction 
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are 
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion 
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired 
fragment. 

Size separation of the cleaved fragments is performed using 8 percent 
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980). 
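
Because a restriction enzyme "acts only at certain sequences", checking a DNA string for recognition sites is a simple search. The sketch below is illustrative only: it uses the well-known recognition sequences of the enzymes named in Example 1, applies them to the 5' primer listed there as SEQ ID NO:29, and does not model cut positions, methylation sensitivity or star activity. The function name is hypothetical.

```python
# Recognition sequences (5'->3') for the enzymes named in Example 1.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "KpnI": "GGTACC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(dna: str) -> dict:
    """Return {enzyme: [0-based positions]} for every site present in dna."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            start = dna.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # 5' primer for AEDII12RA-18B/G as printed in Example 1 (SEQ ID NO:29)
    primer = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"
    print(find_sites(primer))
```

All six sites are palindromic, so searching one strand is sufficient for this check.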

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise 
provided, ligation may be accomplished using known buffers and conditions with 10 units 
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA 
fragments to be ligated. 

Unless otherwise stated, transformation was performed as described in the method 
of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes 
DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60, 
was initially amplified from a pBluescript vector containing the DNA by the PCR 
technique using the primers noted herein. The amplified sequences were then inserted into 
the respective pQE vector listed beneath the primer sequences, and the enzyme was 
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA-18B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29) 
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3' 
(SEQ ID NO:31) 

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT 5' (SEQ ID NO:32) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus 9N2-31B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3' 
(SEQ ID NO:33) 

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Staphylothermus marinus F1-12G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3' 
(SEQ ID NO:35) 

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus chitonophagus GC74-22G 

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 3' 
(SEQ ID NO:37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39) 
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind 
III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41) 
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43) 
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Kpn 
I. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46) 

Vector: pQE52; and contains the following restriction enzyme sites 5' Bam HI and 3' 
Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3' 
(SEQ ID NO:47) 

3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48) 

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

Thermotoga maritima β-mannanase (6GP2) 

riTATrc ^ TTGATTA ^ GA GGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3' 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

AEPII1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3' 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

OC1/4V endoglucanase (33GP1) 

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATATGTTCTT 
3' (SEQ ID NO:53) 

3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3' 
EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 3' 
(SEQ ID NO:55) 

3' ATAAGAAGCTnTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on 
the bacterial expression vector indicated for the respective gene (Qiagen, Inc., Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Ampr), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His 
tag and restriction enzyme sites. 
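
By way of illustration only, the cloning elements carried by a 5' primer can be located computationally. The following Python sketch is not part of the original protocol: the function names are hypothetical, and the use of AGGAG as the core motif of the ribosome binding site is an assumption made for this example. The recognition sequences are the standard ones for the enzymes named in the primer list.

```python
# Recognition sequences (5'->3') for the restriction enzymes named in the primer list.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}

# Assumed core of the ribosome binding site (Shine-Dalgarno motif).
RBS_CORE = "AGGAG"


def clean(seq):
    """Strip whitespace left over from typesetting and upper-case the sequence."""
    return "".join(seq.split()).upper()


def annotate_primer(primer, enzyme):
    """Return 0-based positions of the restriction site, the RBS core and the
    first ATG downstream of the RBS, plus the RBS-to-ATG spacer length."""
    seq = clean(primer)
    site_pos = seq.find(RESTRICTION_SITES[enzyme])
    rbs_pos = seq.find(RBS_CORE)
    atg_pos = seq.find("ATG", rbs_pos + len(RBS_CORE)) if rbs_pos >= 0 else -1
    spacer = atg_pos - (rbs_pos + len(RBS_CORE)) if atg_pos >= 0 else None
    return {"site": site_pos, "rbs": rbs_pos, "atg": atg_pos, "spacer": spacer}


# 5' primer for the Thermococcus 18B/G gene (SEQ ID NO:29), as listed above.
SEQ_ID_29 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"

if __name__ == "__main__":
    print(annotate_primer(SEQ_ID_29, "EcoRI"))
```

A position of -1 indicates that the element was not found in the primer as typed, which may reflect a transcription error rather than a defect in the primer itself.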

The pQE vector was digested with the restriction enzymes indicated. The amplified 
sequences were ligated into the respective pQE vector and inserted in frame with the 
sequence encoding the RBS. The ligation mixture was then used to transform the E. coli 
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies 
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin 
resistance (Kanr). Transformants were identified by their ability to grow on LB plates, and 
ampicillin/kanamycin-resistant colonies were selected. Plasmid DNA was isolated and 
confirmed by restriction analysis. Clones containing the desired constructs were grown 
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) 
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells were grown to an optical density at 600 nm (O.D.600) of between 0.4 and 
0.6. IPTG (isopropyl-β-D-thiogalactopyranoside) was then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O 
and leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were 
then harvested by centrifugation. 
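
The inoculation ratio and the IPTG addition described above reduce to simple volume arithmetic. The sketch below is offered only as an illustration; the 1 litre culture volume and the 1 M IPTG stock are assumptions chosen for the example and are not stated in the protocol, and a ratio of 1:N is read here as one volume of overnight culture per N volumes of fresh medium.

```python
def inoculum_volume_ml(culture_volume_ml, dilution):
    """Volume of overnight culture for a 1:dilution inoculation."""
    return culture_volume_ml / dilution


def stock_volume_ml(final_conc_mM, stock_conc_mM, culture_volume_ml):
    """Volume of stock needed to reach final_conc_mM (C1*V1 = C2*V2),
    neglecting the small volume added by the stock itself."""
    return final_conc_mM * culture_volume_ml / stock_conc_mM


if __name__ == "__main__":
    culture_ml = 1000.0  # assumed culture volume
    low = inoculum_volume_ml(culture_ml, 250)   # 1:250 inoculation
    high = inoculum_volume_ml(culture_ml, 100)  # 1:100 inoculation
    iptg = stock_volume_ml(1.0, 1000.0, culture_ml)  # assumed 1 M IPTG stock
    print(f"O/N inoculum: {low:.1f} to {high:.1f} ml")
    print(f"IPTG stock for 1 mM final: {iptg:.2f} ml")
```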

The primer sequences set out above may also be employed to isolate the target gene 
from the deposited material by hybridization techniques described above. 

Example 2 

Isolation of a Selected Clone From the Deposited Genomic Clones 

A clone is isolated directly by screening the deposited material using the 
oligonucleotide primers set forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems 
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY, 
1982). The deposited clones in the pBluescript vectors may be employed to transform 
bacterial hosts which are then plated on 1.5% agar plates to a density of 20,000- 
50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according 
to the standard screening protocol (Stratagene, 1993). Specifically, the Nylon 
membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm 
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is 
hybridized with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml 
denatured, sonicated salmon sperm DNA) with 1 x 10^6 cpm/ml 32P-probe overnight at 
42 °C. The membrane is washed at 45-50 °C with washing buffer (6 x SSC, 0.1% SDS) 
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are 
isolated and purified by secondary and tertiary screening. The purified clone is 
sequenced to verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide primers corresponding to the 
gene of interest are used to amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the 
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM 
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq 
polymerase. Thirty-five cycles of PCR (denaturation at 94 °C for 1 min; annealing at 
55 °C for 1 min; elongation at 72 °C for 1 min) are performed with the Perkin-Elmer 
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with the expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of interest by subcloning and 
sequencing the DNA product. The ends of the newly purified genes are nucleotide 
sequenced to identify full length sequences. Complete sequencing of full length genes is 
then performed by Exonuclease III digestion or primer walking. 
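
As a check on the reaction set-up described above, the stated amounts can be converted to concentrations and the nominal cycling time estimated. This is a minimal sketch using only the figures given in the text; the function names are hypothetical, and ramp times between temperatures are ignored, so the actual run time on a given thermal cycler would be longer.

```python
def conc_uM(amount_pmol, volume_ul):
    """pmol per µl is numerically equal to µM."""
    return amount_pmol / volume_ul


def amount_pmol(concentration_uM, volume_ul):
    """Amount of a component present at concentration_uM in volume_ul."""
    return concentration_uM * volume_ul


def hold_time_min(cycles, step_minutes):
    """Total hold time over all cycles, excluding temperature ramps."""
    return cycles * sum(step_minutes)


if __name__ == "__main__":
    reaction_ul = 25.0
    print("primer, each:", conc_uM(25.0, reaction_ul), "µM")
    print("dNTP, each:", amount_pmol(20.0, reaction_ul), "pmol")
    print("template:", 0.5 * 1000.0 / reaction_ul, "ng/µl")
    # 94 °C, 55 °C and 72 °C steps of 1 min each, 35 cycles
    print("hold time:", hold_time_min(35, [1, 1, 1]), "min")
```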

Example 3 
Screening for Galactosidase Activity 

Screening procedures for α-galactosidase protein activity may be carried out as 
follows: 

Substrate plates were provided by a standard plating procedure. Dilute XL1- 
Blue MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix 
gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top 
agarose (0.1%) containing 1 mM IPTG to each tube and pour onto an NZY plate surface. 
Allow to cool and incubate at 37 °C overnight. The assay plates contain the 
substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM NaCl, 100 
mM potassium phosphate, 1% (w/v) agarose. The plaques are overlaid with 
nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the nitrocellulose is 
removed and overlaid onto the substrate plates. The substrate plates are then incubated 
at 70 °C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannanase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified 
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer): 5 
x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 
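
The serial dilution above can be verified with a few lines of arithmetic. The following sketch, with hypothetical function names, reproduces the stated final titer and gives the resulting number of plaque forming units plated per tube; only values stated in the text are used.

```python
def serial_dilution(stock_titer, dilution_factors):
    """Apply successive 1:factor dilutions to a starting titer (pfu/µl)."""
    titer = stock_titer
    for factor in dilution_factors:
        titer /= factor
    return titer


def pfu_plated(titer_pfu_per_ul, volume_ul):
    """Plaque forming units delivered in the plated volume."""
    return titer_pfu_per_ul * volume_ul


if __name__ == "__main__":
    final_titer = serial_dilution(5e7, [1000, 100])
    print("final titer (pfu/µl):", final_titer)
    print("pfu per plate:", pfu_plated(final_titer, 8))
```

On these figures each plate receives on the order of a few thousand plaques, a density at which individual clearing zones can still be matched to a membrane position.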

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB plates containing the 
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer 
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were 
incubated at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 
hours then returned to incubation overnight. Putative positives were identified by 
clearing zones on the Azocarob-galactomannan plates. Two positive clones were 
observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannosidase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified 
AEPII1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7 pfu/µl 
diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

A p-nitrophenyl-β-D-manno-pyranoside overlay was applied to the LB plates 
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium- 
phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside (Megazyme, 
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-manno- 
pyranoside treated plates were observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing zones on the p-nitrophenyl-β- 
D-manno-pyranoside plates. Two positive clones were observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 

Screening procedures for pullulanase protein activity may be carried out as 
follows: 

Substrate plates were provided by a standard plating procedure. Host cells are 
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200 
µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15 min. 
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture 
is plated, allowed to cool, and incubated at 37 °C for about 28 hours. Overlays of 4.5 
ml of the following substrate are poured: 

100 ml total volume 

0.5 g   Red Pullulan (Megazyme, Australia) 
1.0 g   Agarose 
5 ml    Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml    5 M NaCl 
5 ml    CaCl2 (100 mM) 
85 ml   dH2O 

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours. 
Positives are observed as showing substrate degradation. 
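
The final concentrations implied by the overlay recipe, and the number of 4.5 ml overlays obtainable from one 100 ml batch, can be worked out as follows. This is an illustrative sketch only; the function names are hypothetical and the calculation assumes the batch is made up to exactly 100 ml with no losses on pouring.

```python
TOTAL_ML = 100.0   # total volume of the overlay batch
OVERLAY_ML = 4.5   # volume poured per plate


def final_concentration(stock_conc, stock_ml, total_ml=TOTAL_ML):
    """Dilute a stock into the batch; the result carries the stock's unit."""
    return stock_conc * stock_ml / total_ml


def percent_w_v(grams, total_ml=TOTAL_ML):
    """Weight/volume percentage (g per 100 ml)."""
    return 100.0 * grams / total_ml


def overlays_per_batch(total_ml=TOTAL_ML, overlay_ml=OVERLAY_ML):
    """Whole overlays that one batch can supply."""
    return int(total_ml // overlay_ml)


if __name__ == "__main__":
    print("NaCl (M):", final_concentration(5.0, 2.0))
    print("CaCl2 (mM):", final_concentration(100.0, 5.0))
    print("Red Pullulan (% w/v):", percent_w_v(0.5))
    print("Agarose (% w/v):", percent_w_v(1.0))
    print("overlays per batch:", overlays_per_batch())
```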

Example 7 
Screening for Endoglucanase Activity 

Screening procedures for endoglucanase protein activity may be carried out as 
follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates 
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The 
plates are incubated at 37 °C overnight. 

2. Plates are chilled at 4 °C for one hour. 

3. The plates are overlaid with Duralon membranes (Stratagene) at room 
temperature for one hour and the membranes are oriented and lifted off the plates and 
stored at 4 °C. 

4. The top agarose layer is removed and plates are incubated at 37 °C for ~3 
hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 minutes. 

7. The plate is destained with 1 M NaCl. 

8. The putative positives identified on plate are isolated from the Duralon 
membrane (positives are identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3. 

9. Insert DNA is subcloned into any appropriate cloning vector and 
subclones are reassayed for CMCase activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3 
minutes. 

ii) Decant the supernatant and use it to fill "wells" that have been 
made in an LB/GelRite/0.1% CMC plate. 

iii) Incubate at 37 °C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1 M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around clone. 

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide selected from the group consisting of: 

(a) SEQ ID NOS: 1-14 and 57-60; 

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U; 

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 57- 
60; 

(d) polynucleotide sequences which encode an amino acid sequence as set 
forth in SEQ ID NOS:15-28, and 61-64; and 

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in 
length and that will selectively hybridize to DNA which encodes a 
polypeptide of SEQ ID NOS:15-28, and 61-64. 

2. A vector comprising a polynucleotide of claim 1. 

3. A host cell containing the vector of claim 2. 

4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by said 
polynucleotide; and 

(c) isolating the polypeptide. 



7. An enzyme selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS: 
15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 consecutive amino acid residues of 
an enzyme of (a). 

8. An enzyme of which at least a portion is coded for by a polynucleotide of 
claim 1, and which is selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence which is at least 70% 
identical to an amino acid sequence selected from the group of amino 
acid sequences set forth in SEQ ID NOS: 15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 amino acid residues to the 
enzyme of (a). 

9. A method for generating glucose from soluble cell oligosaccharides comprising 
contacting a sample containing oligosaccharides with an effective amount of an 
enzyme selected from the group consisting of an enzyme having the amino acid 
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is 
produced. 

10. The method of claim 9, wherein the sample is selected from the group consisting 
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant 
biomass and waste products. 

11. The method of claim 9, wherein the oligosaccharide is selected from the group 
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose, 
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides 
and pullulan. 






WO 98/24799 PCT/US97/22623 

1/46 

Figure 1a. M11TL glycosidase - 29G, complete gene sequence (9/95). [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



2/46 

Figure 1b (Continued). [Sequence listing, continued; not legible in the source copy. The coding sequence ends at nucleotide 1446, with the stop codon at position 482.] 



3/46 

Figure 2. Complete gene sequence (9/95). [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



4/46 

Figure 3. Staphylothermus marinus glycosidase - 12G, complete gene sequence (9/95). [Nucleotide sequence with deduced amino acid translation; not legible in the source copy. The coding sequence ends at nucleotide 1266, with the stop codon at position 422.] 



5/46 

Figure 4a. Complete gene sequence. [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



6/46 

Figure 4b (Continued). [Sequence listing, continued; not legible in the source copy.] 



7/46 

Figure 5a. [Nucleotide sequence with deduced amino acid translation; not reliably legible in the source copy.] 



8/46 

Figure 5b (Continued). [Sequence listing, continued; not reliably legible in the source copy. The coding sequence ends at nucleotide 2166, with the stop codon at position 722.] 



9/46 

Figure 6. [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



10/46 

Figure 7a. Thermococcus chitonophagus glycosidase - 22G, complete sequence (9/95). [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



Figure 7b (Continued). [Sequence listing, continued; not legible in the source copy. The coding sequence ends at nucleotide 1536, with the stop codon at position 512.] 



12/46 

Figure 8a. [Nucleotide sequence with deduced amino acid translation; not legible in the source copy.] 



WO 98/24799 



13/46 



PCT/US97/22623 



1261 AAC CAC ATC CTA ACA CCT TAC TAC ATA CCC ACC CAC ATA AAG ATC ATA GAG AAC C« TT-r 1320
421 Lys Asp Ile Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Lys Met Ile £S JJ? SS ST. 440

1321 GAG CAT GCC TAX CAA CTT AAC GGC TAC TTC CAC TCC CCA TTA ACT CAC AAC TTC CAC TCG 1380
441 Glu Asp Gly Tyr Glu Val Lys Gly Tyr Phe His Trp Ala Leu Thr Asp Asn Phe Glu Trp 460

1381 CCT CTC CCC TTT ACA ATC CCC TTT CCC CTC TAC GAA CTC AAC CTA ATT ACA AAC CAC ACA 1440
461 Ala Leu Gly Phe Arg Met Arg Phe Gly Leu Tyr Glu Val Asn Leu Ile Thr Lys Glu aT, 480

1441 ATT CCC AGC GAG AAC AGC GTG TCG ATA TTC ACA CAC ATA CTA CCC AAT AAT CGT CTT ACC 1500
481 Ile Pro Arg Glu Lys Ser Val Ser Ile Phe Arg Glu Ile Val Ala Asn Asn Gly Val ?" 500

1501 AAA AAC ATT GAA GAG GAA TTG CTC AGC CCA TCA 1533
501 Lys Lys Ile Glu Glu Glu Leu Leu Arg Gly End 511



Figure 8b (continued)






9 IB 27 ■»* 

5' ATGACAiTi . J0 45 54 

C 

Val Thr 



=5='S2SS=S=5S2=S!-- £ 

117 U« US 

AGC CGA GCC CTT TAC OGC ATC AAT AAC TCC AAC CCA CJL& /Lm 162 

X71 180 US is. 



333 34 ^ IS! 

CAG GAA AAC CTG CCC CGC GCC GAC xrr »ty-» tv»,~ rit, 369 378 



JS» 405 iii 

GTC GOQ QCQ ACT TCT GCC TAC AAC TTT AAC GW w *-» *" 432 

TCG ACC GCC QTC GCT CAG AAT CTC GCT GOT Crtr nm «« ™ 486 

GGC GGC GGC GAA GCG CTG GTT GAA GGA GAC CCC far rw- 531 540 

«» «* «* ^ Val ~S~SiS2SS22«« 

TCG CCA GCC GAC ACT OTG GOT ATT CTC GAP r»r 585 S94 

Ser Pro Ala A*p Thr Val c£ £ to S£ S £ J* W « «C 
y «e A3 P His Trp Ph« Gly Val Aan Gly Lmi 

603 621 

==55S==S22== S 2=2«£ 

Figure 9a. 






711 730 729 no 

ssscss^eeeeseeesse 

K EE E = = = E E =5 = 5 = - 5 E £ 

eeSeeeeseessseeeek 

,73 *M «S1 JDll 

COG OT8 TCT GAA MO CAA COC OCA ACT COT CTT CCC CTC CTC a2 ^ 818 

Arg Val sex Clu Olu 01, Arg Ala s« Gly " £ £ £ ™ « Z 

2" 2f 000 «« « ATC GTG rrA CAT « 

L«u His Tyr Tyr Pro Gly Al* Tyr Ken xi. Clu Acp II a v.l c£ "J 

■SEESSSESESSSEESSEE 
S EE 5 EE E EE E EE S EE S EE 

1089 1098 U07 in* ,,,«. 
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A,p Trp Leu Olu a. Tyr M.t Oly Pr. A.p Hi. My Val Thr Leu cry Leu Thr 

1143 1152 iisi 117 « 

?! ^ *" *" « ATQ « «C W TAT GCC TO 

OXu «« Cy. w Arg Xxn V al A« Pro M« Thr Thr Al. Zla Trp Tyr Ala Sar 

1197 120S "» 1234 1233 

ATB CTC GCC ACC TTC GCC GAT AAC COC CTC CAA ATA TTC ACC CCA TCC TOC^O 
Met L~ Gly Thr Ph. Al. A.p A.n Gly val Olu mmihrSSSS 

13,0 "«» "71 12S7 

AAC ACC OCA ATG TOO GAA ACA CTC CAC CTC TTC ACC CCC TAC AAC AAA f^Hi 
A» Thr Oly MeC Trp «» Thr U» U. Uu Ph. S.r Arl t£ a£ £ « ™ 

1305 "" "23 1332 1341 ,« 0 

CCGCTCKCTCCAGCTCCJOTCTr^GAOmWACCCCCTACACC TCC ATT 
Arg V.1 Al. Mr Sar Ser S.r Lm Ql» olu Phe V.l Ser Ala Tyr Sar Hr 5" 

135 ' 1368 "77 UBS 1J95 , . nJ 

2£ ^ ff * " W *°» CT * "» «« ■» »» CGT TCC ACT ACC GAC 

*»» Clu Al. Clu Asp Al. Mac Thr Val Lau Lsu Val Ant Arg Sor Thr Ser Glu 

Figure 9b (continued)






Bankia gouldi endoglucanase (37GP1) (continued)

S ES E S = ™ - = ZS = 55 

14 *' 147S 14J5 u«4 

ACCCTCCCCTIACACAACCTOCCOCGCGAG CAA ACC Xlc OTA 22 r-,, ~ 

Ar, ^ „ ia to ^ Pro 01y 01u G ^ £ 

^ S 2 £ £ ™i « « ACA 

*u m oiy TUT. V«U Arg Ala Sex Asp Asa Thr Val Thx Leu Glu 

^ 1575 1584 1593 1602 leu 

™ •« «T CTG TCC GTT ACT GCA ATA TTG CTC AAG CCC CGG CCC TAX 3- 
I*u Pro Pro Leu Ser Val Thr Ala II. Le U Leu Ly, Ala Arg * " 



Figure 9c (continued)
Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence

val xi. cy« val clu n. P)"« cly £ ^ ^ ^ ™ ~ ™ ™ ~ — 

Clu l*a A«„ Ph. Jh, « al " «u Pbi M. vil eta Zy." ^ tei 1^ »y i£ 

126 135 m ].« 

Lys He Ser Gly Aro Val Lyi Gl"y Ser Gly Ary Leu Glu Val Leu Ary Thx 

^Ti* f3T **** TOt * ^ s 

Lys Ala Pro Glu Lys Val Leu Val Asn Asn Trp Gln Ser Trp Gly Pro Cys Arg
225 234 243 252 261 270

Val Val Asp Ala Phe Ser Phe Lys Pro Glu Ile ^ to ^ iy^
279 288 297 306 315 324

Thr Ala Ser Val Val Pro Asp Val Leu Glu Arg Asn Leu Gln Ser Asp Tyr Phe
333 342 351 360 369 378

Val Ala Glu Glu Gly Lys Val Tyr Gly Phe Leu Ser Ser Lys Ile Ala His Pro

387 396 405 414 423 432

TO TTC GCT GTG CAA GAT CGC GAA CTT GTG CCA TAC CTC GAA TAT TTC CAT GTC

Phe Phe Ala Val Glu Asp Gly Glu Leu Val Ala Tyr Leu Glu Tyr Phe Asp Val

441 450 459 468 477 486

GAG TTC GAC GAC TTT CTT CCT CTT GAA CCT CTC CTT OTA CTC GAG CAT CCC AAC

Glu Phe Asp Asp Phe Val Pro Leu Glu Pro Leu Val Val Leu Glu Asp Pro Asn
495 504 513 522 531 540

^5!! ***** *^ ccc

Thr Pro Leu Leu Leu Glu Lys Tyr Ala Glu Leu Val Gly Met Glu Asn Asn Ala

549 558 567 576 585 594

ACA GTT CCA AAA CAC ACA CCC ACT CCA TCC TCC ACC TCC TAC CAT TAC TTC CTT

Arg Val Pro Lys His Thr Pro Thr Gly Trp Cys Ser Trp Tyr His Tyr Phe Leu

Figure 10a
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ViirarAr3 ely ^ phe p " v*i «„ «J » ; *L" t ~ ^ i~ — — 

2f 2! ™ « « « ™ « S ccc « S act err JS caa ac= &° 
*- «r «» ii. ,*„ cly n. ^ ^ pro" ihiSSSSfc 

Val Phe Asn Clu His Pro Asp Trp Val Lys «J ^ c ly ^ ~ ~ 
Ala Tyr Arv Aan Trp A*n Lyt Ly» Ila Tyx Al» L«u Aip Leu s«x Lyo Asp 

*rg Tyr Phe Ly, il. xsp Pho L«J All ciy d Pr= sly ^ ^ ^ 

^ ^ ~ « ™ - - « ATT CAc'S ATC AGA^AAA 

Ll, ° To Xl« Gin All Pte A^ [yi ciy ill Olu Thr II. Arg ly~ a 

OCC « ^ TTC ATC^CTC CCA TGC^CCC «, CCC^ CTT CCC^ 



Ma vol ci y clu ^ ^ Hie u ; ™ "J" ™ *- y — ™ — ™ 

r-m ^ k U4J Ui2 1170 1179 11 BR 

™ ^ ™ ^ « A ^ « ™ ™ CCT CAC ACT CCC TIC TOG OCA 

V*l Gly Cy S Vnl /^p cly ^ ^ ]" Q ^ £ ^ ^ ™ ™ ™ 
Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence

U97 1206 121b 1224 12.U L242 

™ f^I ™ GAC AAC GC1A OCT CCC CCT CCA ACA 7K3C GOG CTG AGA AAC CCC 
Clu His He Clu Aap Asn Ciy Ala Pro Ai* Ala Arp Trp Ala L«u Arg Asa AJo 

1251 1260 1269 1278 12B7 m B( 

-^^^^^^^f^^^^CTCAAC GAC CCC ©c TOT^ 
lie Thr Arg Tyr Pho Hmz. Hie Arp Arg Phc Trp Leu A*n Arp Pro Asp Cya Leu 

5!? !?? ACC GATOT ACA CAG^AAG CAA AAG^GAG CTC TAC^ 

Tie Leu Arg Glu Clu Lys Thr Asp Leu Thr Gin Lys Clu Ly^ Gl^ L^u" Tyr ^ 

~ ™™ ^ AGC CAT^GAT CTC TCGOT 



Tyr Thr Cy% Cly Val Leu Asp Aan 



Mac He He Clu Sex Asp Asp Leu Sex Leu 



Val Arg Asp His Gly Lys Lys Val Leu Lys Glu Thr Leu Glu Leu Leu Gly Gly

^ ^ f?T < ^t*** ™ ato1 ^g cac ca/ctg aga ta^gag atc gicicg 

Arg Pro Arg Val Gln Asn Ile Met Ser Glu Asp Leu Arg Tyr Glu Ile Val Ser

1521 1530 1539 IMS 15S7 T«:«ir 

!^™^^^«^GTCAAC*«GWC1CCKP CTC AAC AGC AGA GAG 
Ser Gly Tfcr Leu Sex Cly Asn Val Lys. He Val Val ;. r p ~. ~. T~r clu 

— ' 1375 1584 1593 1602 1611 i«n 

^^f!!™^™f* WTOTO f^^^^OICGTC AAA AGA 
Tyr Hie Leu Glu Lys clu Cly Lys Sex Sex Leu Ly^ Ly^ Ar^ vil Val Lys" Ar^ 

1629 1638 1647 1656 1665

GAA GAC CGA AGA AAC TTC TAG TTC TAC CAA CAC CCT GAG AGA GAA TGA 3'

Glu Asp Gly Arg Asn Phe Tyr Phe Tyr Glu Glu Gly Glu Arg Glu



Figure 10c (continued)
Thermotoga maritima β-mannanase (6GP2)

9 18 27 36 45 54

5' ATG GGG ATT GGT GGC CAC GAC TCC TGC AGC CCG TCA CTA TCG CCG GAA TTC CTT

Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu

63 72 81 90 99 108

TTA TTC ATC CTT GAG CTC TCT TTC GTT CTC TTT CCA ACT GAC GAG TTC GTG AAA

Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162

GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC

Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216

AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT CTT CTG GAG

Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270

AGT GCC AGA GAC ATG GGT ATA AAG CTC CTC AGA ATC TGG GGT TTC CTC CAC GGG

Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324

GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC

Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe

333 342 351 360 369 378

GGG CTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC

Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432

ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC

Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486

AAC TGG GAC CAC TTC GGT GGA ATG AAC CAC TAC GTG AGG TGG TTT GGA GGA ACC

Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540

CAT CAC GAC GAT TTC TAC ACA GAT GAC AAG ATC AAA GAA GAG TAC AAA AAG TAC

His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11a
Thermotoga maritima β-mannanase (6GP2) (continued)



549 558 567 576 585 594

GTC TCC TTT CTC CTA AAC CAT CTC AAT ACC TAC ACG GGA GTT CCT TAC AGG GAA

Val Ser Phe Leu Val Asn His Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu

603 612 621 630 639 648

GAG CCC ACC ATC ATG CCC TGG GAG CTT GCA AAC GAA CCG CCC TGT GAG ACG GAC

Glu Pro Thr Ile Met Ala Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp

657 666 675 684 693 702

AAA TCG GGG AAC ACG CTC CTT GAG TGG GTG AAG GAG ATG ACC TCC TAC ATA AAG

Lys Ser Gly Asn Thr Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys

711 720 729 738 747 756

ACT CTC GAT CCC AAC CAC CTC GTG CCT GTG GGG GAC GAA GGA TTC TTC AGC AAC

Ser Leu Asp Pro Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn

765 774 783 792 801 810

TAC GAA GGA TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG CCC TAC AAC GGC TGG

Tyr Glu Gly Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp

819 828 837 846 855 864

TCC GGT GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG

Ser Gly Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr

873 882 891 900 909 918

TTC CAC CTC TAT CCG TCC CAC TGG GGT CTC ACT CCA GAG AAC TAT GCC CAG TGG

Phe His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

927 936 945 954 963 972

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC

Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys Pro

981 990 999 1008 1017 1026

GTT GTT CTG GAA GAA TAT GCA ATT CCA AAG ACT GCG CCA GTT AAC AGA ACG GCC

Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg Thr Ala

1035 1044 1053 1062 1071 1080

ATC TAC AGA CTC TGG AAC GAT CTG CTC TAC CAT CTC GGT GGA GAT GGA GCG ATC

Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly Ala Met

Figure 11b (continued)
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!5 7 S 1584 i5Qi 
Figure 11c (continued)
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1629 1638 1647 1656 1665 

™ ™ ™ !?! ™ ™ ^ *" «» « «« «G CTG lie CTG AAA CTC 

He Glu Trp Asn Gly Glu V.l Cly Asa Gly mI L^u" cin L^u Zl vVl 'Cyl Leu 

1683 1692 1701 1710 1719 

™ ^ ^ ^ TGG CAA GAA CTO AGA GTA GCA AGG AAG TTC CAA AG A CTC 

Pro Gly Lys Ser Asp Trp Glu Glu vll Z'g vll HI Z~ 9 lyl V h l Glu Ar* ^eu 

1737 1746 "55 1764 1771 

!^ ™ *?! !!f ™ I AC ATC TAC ATT CCA « gga ™ 

Ser Glu Cys Glu lie Leu Glu Tyr L"p i!« ryr III ten ^ Glu G^y iZ 
1791 1800 1809 1818 1827 , 0 « 

~ ™ !!! " I AC 000 m 00 ^ 000 000 «« « ata £2 

Ly * Gly **» Leu Pro TV* Ai* V*l Leu Asn Pro Gly T^p" Val Lys He" cly 

1845 1854 1863 1872 1881 1890

^ ^ -I! ^ ^ ^ ^ GAA ACT GCG GAO ATC ATC ACT TTC GGC GGA

Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly Gly

1899 1908 1917 1926 1935 1944

^ ^ ^ ^ ™ ^ CTA ATT M TO « AGA ACA GCG GGG GTC
Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala Gly Val

1953 1962 1971 1980 1989 1998

^ Ti * ***** CTT GTC GGT CAT CAT CTG AGG TAC GAT GGA CCG ATT

Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp Gly Pro Ile

2007 2016 2025 2034 2043

TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG TGA 3'

Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met



Figure 11d (continued)
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5 ' ATC CTA CCA GAA r\r* Jtl 27 36 

- - ~ - = = = ;;; s „•; - 

63 ?2 ' rne Gln Phe Glu 

- «y - u. ^ - - - - ~ ~- !? « - » 

117 u , "* **" «*» »TP 

«- - - - ^ »: = - - ~ - = = = 

342 ^ 

— = — = 

p val Lys He Asp Lya Smx- th- r 

^ Al* Glu ^ ^ ^ ^ ~ ~ 

«« oiu v.! ~ - "! ™ ^ ?* e « « ™ 

«i r "-™«— -« = = £ = 

«*- v.i P h . v.i n: - «; - - --- ~ ?if ^ ? c « 

^ - 

ne v.i rr : - ~~ TC 000 1,30 °« to ca G 

P *** 110 G1 * Trp Val ser oin 
Figure 12a
AEPII 1a β-mannosidase (63GB1) (continued)

549 558 567 576 585 594

AGG ACA GTT CTT GAG TTT CCC AAC TAT GCT GCT TAC ATC GCC CAT GCG CTC GGA

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

603 612 621 630 639 648

GAC CTC GTG GAC ACA TGC AGC ACC TTC AAC GAA CCT ATG GTA GTT GTG GAG CTC

Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu

657 666 675 684 693 702

CGC TAC CTC GCC CCC TAC TCA CCA TTT CCC CCG GGA GTC ATG AAC CCC GAG GCC

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala

711 720 729 738 747 756

GCG AAG CTG GCG ATC CTC AAC ATG ATA AAC CCC CAC GCC TTG GCA TAT AAG ATG

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810

ATA AAG AGG TTC GAC ACC AAG AAG GCC CAT GAG GAT AGC AAG TCC CCT GCG GAC

Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp

819 828 837 846 855 864

GTT GGC ATA ATT TAC AAC AAC ATC CCT CTT GCC TAC CCT AAA GAC CCT AAC GAT

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 918

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC ACC GGA CTG TTC

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

927 936 945 954 963 972

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC CAC GGC GAA AAC TTT

Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

981 990 999 1008 1017 1026

GTA AAA GTT AGA CAC CTA AAA GGC AAT CAC TGG ATA GGC CTC AAC TAC TAC ACC

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080

CGC GAG GTT GTT AGA TAT TCG GAG CCC AAG TTC CCA ACT ATA CCC CTC ATA TCC

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12b (continued)



Figure 12c (continued)
OC1/4V endoglucanase (33GP1)



ATC GTA GAA AGA CAC TTC AGA TAT c£ CTT ATT W *~ " _ . 54 



Mec 



A ™ TCc acc era ttt ctt ctt atc 

Val Glu Axy Hia Ph« Ara Tvr v*l r.*,. tI"! ZT" "~ Z?~ 



Axs Hi a Ph« Arg Tyr Val Leu lie Cy 3 Thr Leu Phe 
" 72 M 90 „ 



Leu Val Met 



CTC CTA ATC TCA TCC ACT CAC CCA AAA AAT GAA CCA AAC AAA AGA GTG ^ 

^ L,u ne ser s " "» Gin ^ £ £ £ ^ ~ --- - 

117 1JS 135 14* 

Ser » Glu Gl„ se r v«l Ala III S Z ^ III Asn £ aII ^ ~ ~ ™ 

171 18( > 189 iQft 

~ -™ - ^ ™ ™ A " « «« tta « cc, TC ™ 

Ly* «ec Vli dy Ly, Cly vll 111 7ll aly ^ ^ ^ ~ ~ ~ ~ 
225 334 3« 2S2 5r, 

~ ~ ™ ™ ^ *™ ^ ™ ~ ~ «* ««• "a mo 

Gly U. Trp Cly Val Ara II. Glu I*p «u £ phi gIu nil J£ HI lyl Arg 
379 288 391 306 i,« 

~ ™ *! !?!?!! t!? ™ ™. « ** « ^ 

Gly Ph. A.p ser v.l Ar a n. HI ^ ™ ™ ~ ~ ~ ~ ~ ~ 

Pro Pro Tyr A.p u. A»p Arg A,„ Uu Mu Ar^ vll ill ^ ^ ~ 

387 396 405 411 

*GC GCT CTT GAG AAT AAT TTA ACA OTA ATC ATC AAT ACQ CAC CAT TTT GAA GAA 

Ar, Al. L.u Olu A,n A3„ l^u Thr ' V H III III Zl ill III v. r H ~ ~ 

441 450 459 4 6 g 

™ I-! ™ ™ ™ "I ^ I*! « « T m m ™ *« TO AGA CAO 

L.u Tyr Cl„ Clu Pro A,p Ly S Tyr Gl"y Asp m HI £1 ~ ~ ™ ™ 

495 S0 « 513 522 

-I! !!!!!! ^ ™ ™ 005 «* *" ™ tTC TTT GAA ATC TAC AAC 

». Al. Ly. Ph. Ph. Ly. A S p Tyr £ ^ ~ ™ ~ ™ ~ ™ — 

Figure 13a
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ox- - ein ^ ^ Thr A1 . ~ ~; ~ - --- ... ... ... 

€03 g 13 , 

S« A. n P„ Thr Ar B II. Val Ile Ajp ^ ^ 

657 «66 fi7e 

*P Hi, ryr Ser Ala v al ^ ~ ~ ~ ~ ~ - --- -_ 

711 720 725 

n. a. «i s« , h . Hi. ^ ^ ~ ~ ~ ;« - ~ --- _ 

765 774 783 7<n 

«- *p « xx. ^ " ~ ~ - - - - --- --. .„ 

815 828 831 

«- n. «. n. a,, s . t Hi. £ £ ~ ~ ~ ~ - - ~- --- 



918 
ATG 



*.= *.» v.x a . Ph . ^ G " ~; ~ - jua ~ ~ --- 

S « A,, « .^T. „ u S „ ~ ~ ~ - - - ~ --- 



1026 



Ser ^ Ai. ^ ^ Glu ph . ^ ~ ~ ~ - --- _ „ 

1035 1051 

« n to Trp u . Glu ^ ^ Wa ^ ~ ~ --- 

TAA J' 



Figure 13b (continued)
Thermotoga maritima pullulanase (6GP3)



27 3S <5 54 



9 - 18 

S> -If ™ ?H K fS_ ^! ^ ATA CTC AGG ^ AAC GAG CAG CCA AAA 

Met Asp Leu Thr Ly. Val cly III HI Z £ ill Z'n Gl'u V^> HI Zyl 

**** GCT GAA GTX2 TCG 

Asp Val Ala Ly. Asp Ar ff Pha ~ ™ ~ ™ ™ ~ ~ ~ ~ ~ ™ 

117 136 135 144 

-I* f !! ™ ™ ™ ~ ™ « TC ™= «* «»* «A GAC iS TCT CCC a£ 

n. l.u «. ciy v«i «; «; n~ ^ ;i; i~ ~ ~ — 

171 180 !B9 198 207 

II. Phe Ph. Al. Gin Ala Ar B ^ ^ ^ "I n. «„ III ^ ^ £ ~ 

225 334 J« 252 5 si 

!!I !!!!!! *™ ^ ^ ^ « ™ =IT ACT CTP «C GGA AAA 
Pro V«l Asp Thr Ly. Ly. Ly. Glu Leu Pha Ly» III T^r vll tap My Lys clu 

"* 288 "7 30S 3is 

He Pro V.1 Ser Axg V.1 Glu iyl Al. A^ p^ ^ Zl III Asp v»l Thr Asn 
J3i 342 3 '1 3(0 ico ,,. 

Tyr V.l Ax, U. v.1 L.u s.r Glu III Leu Zy. clu Glu Zl Ly," 

387 3S« 40S 4H 

™ ~ ~ -If -I- ™ !!! *** «* 010 ATC ATO GAG ATC 

V.1 Glu Leu II. II. Gl„ Gly Tyl ly. £ ^ ^ ™ ~. ~ ~ ~ ~ 

4*1 450 as 468 ... 

™ ~f ~ ™ I*! ™ ™! °» a 00 °» « GTA TAT TCT CCA GAG AAC 

Leu Asp Asp Tyr Tyr Tyr A»p Gly Glu Uu Gly III V«7 ^ Zl III «u lyi 

495 504 5 " 522 511 KJ „ 

Thr lie Ph. Axg V.l Trp S„ P ro W S.r Ly. Trp vll Ly. vll Leu Uu Phe 

Figure 14a
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^^^^^^^^^ 57, 5 a 5 

MC A TG GAA TAC AAG GGA 

«*- ^ sly oiu ^ ^ Glu Pro - ~; -~ --- ~- -- « 

vai Asn Met Glu Tyr Lys Civ 

603 gjj 

- «. ... „ „. „. „ „„ - ~ = = _ ... 

"*.-"'— *-■»-"- - « = = 5 S~;:;=~ 

711 720 750 

- v., 01n Glu ^ -- - - - - _ _ ___ 

765 774 Tot 

- «- «. - ^ - - - - ~- - 

819 828 an 

». n. ^ «. Ita „. ai» ;;; = - ~ - - - » ... 

- ,y. Hy _ ^ ^ 01y ^ = ~ - - - --- ... 

937 336 945 

«y « - ~ « y u. s., .I." £ « ~ ~ - - - ~ --- --- 

981 qqq 

». - *o , h . , he „ p Phe ^ - - ~ - ~ --- ... ... 

1035 1044 io^i 

. nTrp 01y ^ ^ ^ ™ ~ ~ - --- _ ... ... ... 

Figure 14b (continued)
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1089 1098 1107 

tac ™ «a, ccc^Mccaaccc aca^ aga ^ ^ ^ 
Tyr s„ a, p p„ Lys Pro H1J ~ ~; ~ : ;~ --- --- --- ... 

GTC AAA CCC CTT CAC^ CAC SCr"£ GOT ««« 

^' A ™ SCT CTC ATT ATG CAC ATG GTC TTC CCT 

V.! «0, Al. Le u ^ lya £ ~ ~ ~ ~ ~ --- --- --- ... 

1206 1213 

CAC ACC ^ CACACC^cc,^" 

Hi. Thx TVr Oly II. oiy Ola ^ S „ aL" P^ a^ £ £ "J £ ~ ~ 

!251 1260 12*0 «... 

Ph. lyr Ar, II. Asp Ly, Thr Gly Al"." Tyr £ ^ oL~ s « £ ~ ~ ™ 

1305 1323 n „ 

~ -!! ^ ~ ~ ^ « »° « ™ ™ ATA CTC'CAT ACC « ™ 
Val XI. Al. S.r Glu Ar g Pro « ^ ^ £ ~ ~ ~ ~ ~ ~ - 

1359 1368 l37? 

I- ™ T. ^ ™ ™ « « « «c ac4 m gat'cII a TO ggt 1 ^ 

lyr txp VI Ly. Cl» Tyr Hi. IU Lp ci~y ^ ^ ^ As'p" ^ £ ^ ~ 

1413 U22 143! Wift 

*™ ~ ^ ™ ™ *I! ™ ~ !* « err cat 1 ^ ATC CAx'cci 

ii. A. P tyi i. y . h« Bl v.: ;i: £ ^ a; a."; 

1*67 1476 was * AnA 

«»r ii. a . i*. rvr 01y „. Pro ^ £ ~; ~ --; - ~- -« ~- --- 

1521 1530 153a 

^A AAO A=C =AT CTC =CC OCC ACA CAC OT ^ TTC AAC^T CAO rrC^ 
«y Ly. S« A^ V.! Al. «y ^ £ ^ ~ ~ ~ ™ ~ ~ 

1575 " 1584 1593 1SQ , 

A3P A la X1 . ArS Oly S .r V.1 Ph. ^ ~ ~ ^ ~; ~ - ~- ™ 

Figure 14c (continued)
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«« « « « « ACC ^ ^MJ agg »» 1S74 

1692 

p ciy 1? : l ~ - - - ~ - - - - = = = = = £ 

« « ~ « OC « l S » «C™ AAC TAC^CTT GCC GCC^AAA 

M- Ai. cy. Hi. Asp A,„ His Thr £ ^ " ~ 7 ~ 
« I«u Tzp Aap J,y» Asn Tyr Lau Ala Ala Lys 

1791 IfiOO isna 

„„ „„ - = - ~ -.. ._ ... _. 

= ~ ~ ? ~ = ■» »» « «■£ m „™ 
" "• - - — - * = = = = = s = 

1899 1508 ion 

2f ------ XACAAC^ CCTAK: - 

- - A,, ^ A. n p..' ~ ~ ~ ~ ~ ~- - * S 

^ Asn Pro He st 

1953 X962 

;P Sf =f ™ "J ~ ~ « - « O.'S „ O^'S TO 

-«-——-. «; = = = = = s; - 

2061 2070 

GCT GAA GAG ATC AAA AAA CAC CTC cul m 2097 2106 

«. ... ,. ^ ~ = _ _ _ ... ... 

= = = ™ = - . ««« „ 

„. „. „, - - ~ ~. _ ._ _ ... 

Figure 14d (continued)
Thermotoga maritima pullulanase (continued)

ATT TAC^T » ^£ CAC AAC^ ACA TAC^AAA CTC « M ™ 

He Tyr A, n Cly Aan Uu Ly* Thr Thr Tyr £ ^ ^ Gx"u «; ^ ^ 

2223 2232 2241 2 5 cn 

AAT «o CTT „g «c ACC CAC AAA CCC COX ACA GAA STC ATA GAA ACC G^ 
A» V.l <U Val S . r 01n Lys Cly ciC ^ Xl^ Gl^ £ J~ ~ 

2277 22B6 2295 2304 

~ ~ -I- ~ !!! ™ fff ™ ™ « ™ «™ CM TAC ^ GAG TCA 3 • 

ciy. n. gi» u» a«p Pr ; ^ Se ; ^ ™ ~ ™ ~ — - :: 



Figure 14e (continued)
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Figure 15 . »« B0CW ^ , 

(Clone # «»2> Glycosidase 

=========== ==r=== 

GTG AAA GTG GAA AAC GGA AAA TTC rrr ^ 

==========r == = === 

GGT ™ GAA AGA CTC GAC TAP Am „ w ~ 

- - - «. _ ... ™ - ~ r r m - 

y AJ - a ^ Glu Leu Gly H e 

" Z ~ZZZZZZZZZZZZZZ 
=======5=== === -- = 

y Tyr val Ser Phe Leu Val A* a His 

=====5====rr== === 
CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG ACT CTG GAT CCC
Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys Ser Leu Asp Pro

AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TTC AGC AAC TAC GAA GGA
Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn Tyr Glu Gly

TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG TCC GGT
Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly

GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG TTC
Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr Phe

CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG
His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA
Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys

CCC GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA
Pro Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg

ACG GCC ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT
Thr Ala Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp

GGA GCG ATG TTC TGG ATG CTC GCG GGA ATC GGG GAA GGT TCG GAC AGA GAC
Gly Ala Met Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC
Glu Arg Gly Tyr Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp

AGT CCA GAA GCG GAA CTG ATA AGA GAA TAC GCG AAG CTG TTC AAC ACA GGT
Ser Pro Glu Ala Glu Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly

GAA GAC ATA AGA GAA GAC ACC TGC TCT TTC ATC CTT CCA AAA GAC GGC ATG
Glu Asp Ile Arg Glu Asp Thr Cys Ser Phe Ile Leu Pro Lys Asp Gly Met

GAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTT TTC GAC TAC AGC AAC



Figure 15b (continued)
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ACG TTT GAA AAG TTO TCT GTC AAA GTC GAA GAT CTG GTT TTT n.. 

- ^ Glu Lys Leu ser val LyB m ™ - - - « GAG 

«y Tyr Oly lie Tyr Gly Phe k3p Leu ftsp Thr ^ ^ 

ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT GAA GGC c*r 

- - Asp Gly Glu His Glu Het , he Leu ~ - « » = - £ 

s s £ z z s r r w gtc gtc « - « « - « 

i*» asp ser ne lys Ala Lys val val RBa giu wa ^ ^ ^ 

CTC OCA GAG GAA GTT GAT TTT TCC TCT rr-» 

- - «. - v, ASP , he - « - = « - - ~ - 

a" c 0 % A : a : ™ ? r ™ - - » « « - ™ aac 

Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp Ile Glu Trp Asn

GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTG CCC GGA AAG
Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu Pro Gly Lys

GGA AGG TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC
Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

CTC GAC ATG AAC AAC GCG AAC GTG GAA AGT GCG GAG ATC ATC ACT TTC GGC
Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly

GGA AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT car ^
Figure 15c (continued)
GGG GTG AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AGG TAC GAT
Gly Val Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp

GGA CCG ATT TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG
Gly Pro Ile Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met

TGA 1991
END



Figure 15d (continued)
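
The figures pair each row of codons with its deduced amino acids, so a row can be checked mechanically against the standard genetic code. The following is a minimal sketch of such a check, not part of the original disclosure: the names `translate`, `mismatches`, `CODON_TABLE` and `THREE_LETTER` are hypothetical, the standard (universal) genetic code is assumed, and stop codons are written "End" as in the figures. The example row is copied from Figure 15b.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter names; "End" marks a stop codon, as in the figures.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(codon_row):
    """Translate a whitespace-separated row of codons to three-letter names."""
    codons = codon_row.upper().split()
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


def mismatches(codon_row, printed_row):
    """Return (position, codon, printed, expected) for each disagreement."""
    codons = codon_row.upper().split()
    expected = translate(codon_row).split()
    printed = printed_row.split()
    return [
        (i, c, p, e)
        for i, (c, p, e) in enumerate(zip(codons, printed, expected), start=1)
        if p != e
    ]


# One codon row and its amino acid row, copied from Figure 15b.
ROW_DNA = "TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG TCC GGT"
ROW_PROTEIN = "Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly"

print(translate(ROW_DNA))
print(mismatches(ROW_DNA, ROW_PROTEIN))  # empty list when the row is consistent
```

A codon containing a character other than A, C, G or T raises `KeyError`, which is a convenient way to locate scanning damage in a row before comparing it with the printed amino acids.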
Figure No. 16a Thermotoga maritima (6GB4)

: ======5==r= ========= ... 

«» CAC CCG TAC CTT GGG ATG AAC GAA CAT CTC TTC „r r» 

« «. - ^ v al 01y Met A8n Clu Asp « ™ « « « «* «c aoa cao „ AIC 160 

/« MU Ila Glu Asp Ars oiu Tip u , so 

"1 TAC GAG AGO QAG TTC GAG TTC AAA GAA GAT GTG aaa r 

- =============;= ==5== ,; 

=-»-"-n:-:----»™ r « : ~ :: ... 

9 / Tyr lie Arg L y S Ala Gin Tyr Ser Tyr iso 
eiy i le Trp Lys PrQ yaI ^ ^ ^ ^ 

"1 GTA AAC GOT GAA AAC ATA GGG GAG TTT OCT GTT rrr 

- .*-^==:=™-- == - 

wp Tyr Pro Trp Asn Val ciy Lys Pro 260 
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900 
300 



9(0 
320 



761 TAC CTG TAC GAT TTC OTT TTC GTG TTG AAA OAC TTA AAC GGA GAG ATC TAC AGA «, -» 

- ryr u. ^ A . P P h , m , h . Vil Leu Lys „ Leu ^ oly Mu « « « « J* «. 

2 "l ^ r xf T "° ^ *" ^ =TT ^ « °» <^ ^ GGA AAA ACT 

"1 «*._«*. XI. Gly L.u Ar 9 Ar 9 v.l Ar 9 a. v a l Gl„ Glu p„ tap „„ „, 01y £ £ 

901 TTC ATA TTC GAA ATC AAC GGT GAG AAA GTC TTC GCT AAG GGT OCT AAC TGG ATT CCC «, 
301 Ph. XL Phe Glu m A.n oly «u Ly. V.1 ». Ly . oly w . £ £ « « ~ 

Ml GAA AAC ATC CTC ACG TGG TTG AAG GAG GAA GAT TAC GAA AAG CTC GTC AAA ATC GCA irr 

- «. A.» He Leu * X. Leu Lys G lu Glu ^ ^ 0 , tya J « ™ « - - 

m e ~ z « a" r t r 400 otc wg ■* sga oga aic tac «• « - « 

Asn Met AS„ Met Leu Ar 9 V.l Tlp G1 y Gly Gly u. Tyr Glu Ar 9 Glu He Phe ,„ 

"« ^ C r "* " * ^ ^ MC AT ° CTC TC ° °» =« TTC ATG TAC GCG TGT CTT „„ 

3" Tyr AT, Leu Cy. A.p Glu Leu Gly „. Met ViI Trp 01 „ Asp phe ^ ^ ^ ^ ^ 

m Glu I!' r ^ OT CM ™ " C " C CCS - GAG GCA AGA AAG ATT !200 

3.1 <U« Tyr Pro Asp „ ls ^ p „ Trp ph . ^ Ly , Ma ua ^ ^ 

♦01 v.1 Ar 9 Ly. Leu Arg Tyr Hi. Pro Ser a. v.l Leu Trp Cys Gly ^ Asn 5lu A5 „ ^ 



420 



Trp Gly Phe A8p Glu Irp Gly ^ Mec u> Arg Lya ^ ne ^ ^ ^ o 

"« ^ ^ ^ T T 0A ° A " *» MC CC ° TCC « « TAT !3S0 
Ar9 L.u Tyr L.u Phe A.p Ph. Pro 01u Ile ^ AU Glu ^ ^ ^ 

"« ^ To T T ^ T " " 0(51 GAA ^ " C " C ^ - CAC AGG CAC 1,.0 

Trp Pro ser S« Pro Tyr Gly G!y Glu Ly. Ala Asn Ser G lu Ly. Glu Gly Asp Arg Hi. <80 

Trp Tyr Val Trp Ser G!y Trp Met Asa Tyr «u Asn Tyr Glu Ly. A.p Thr Gly Ar, 500 

"01 TTC ATC AGC GAG TTT GGA TTT CAG GGT GCT CCC CAT CCA r»f „, 

501 Dh. ,1. ^, „ t 1:01 M ACG AT * WO TTC TTT TCA 1SS0 

He ser Glu Phe Gly Ph. Gin Gly Al. Pro His Pro Glu Thr He Glu Phe Phe Ser 520 

1561 AAA CCC GAG GAA AGA GAG ATA TTC Cat rrr r-rr 

521 Lya Pro ei u . T ^ ^ AAC AAA CAG GTG GAA X620 

t*. Pro Glu Glu Ar 9 G!u Ile P he His Pro Val „ et Leu Lys His Asn Lys Qln ^ ^ ^ 

Figure 16b (continued)
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Lys Cys Lys Asp Phe Asp 



1660 
550 



- ==========rrr=s =s=== r 

■= ======= r=zr== == = ===s r 
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■= ==-==5=====xr== ====r - 

AAa L y a Ser Leu 6$o 

•» *-:;:=s™ === - .... 

2041 TGT GAG TTT OCT tqa 2o55 
6 «1 Cy 8 Glu Phe 01y ^ 6fl5 



1920 
£4 0 



Figure 16c (continued)
Figure No. 17a Bankia gouldi (37GP4)

1 ATG AAA AAA AAT CTA CTA ATG TTT AAA AGG CTT ACG TAT CTA CCT TTG TTT TTA ATG CTG 60
1 Met Lys Lys Asn Leu Leu Met Phe Lys Arg Leu Thr Tyr Leu Pro Leu Phe Leu Met Leu 20

61 CTC TCA CTA ACT TCA GTA GCT CAA TCT CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAC 120
21 Leu Ser Leu Ser Ser Val Ala Gln Ser Pro Val Glu Lys His Gly Arg Leu Gln Val Asp 40

121 GGA AAC CGC ATT CTT AAT GCG TCT GGA GAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 180
41 Gly Asn Arg Ile Leu Asn Ala Ser Gly Glu Ile Thr Ser Leu Ala Gly Asn Ser Leu Phe 60

181 TGG ACT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GCA 240
61 Trp Ser Asn Ala Gly Asp Thr Ser Asp Phe Tyr Asn Ala Glu Thr Val Asp Phe Leu Ala 80

241 GAA AAC TGG AAT AGC TCA CTT ATT AGA ATA GCT ATG GGC GTA AAA GAA AAT TGG GAT GGC 300
81 Glu Asn Trp Asn Ser Ser Leu Ile Arg Ile Ala Met Gly Val Lys Glu Asn Trp Asp Gly 100

301 GGA AAT GGC TAT ATT GAT ACT CCG CAG GAG CAA GAA GCT AAA ATT AGA AAA GTT ATT GAT 360
101 Gly Asn Gly Tyr Ile Asp Ser Pro Gln Glu Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

361 GCA GCT ATT GCT AAC GGC ATA TAT GTA ATA ATA GAC TGG CAC ACT CAC GAA GCA GAG TTA 420
121 Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140

421 TAC ACA GAT GAG GCT GTT GAC TTT TTT ACC AGA ATG GCA GAC CTA TAC GGA GAT ACT CCC 480
141 Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro 160

481 AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA ACT TGG CCT GTT ATT AAG AAT 540
161 Asn Val Met Tyr Glu Ile Tyr Asn Glu Pro Ile Tyr Gln Ser Trp Pro Val Ile Lys Asn 180

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 600
181 Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val 200

601 GGT ACT AGC AAT TAT TCT CAG CAA GTT GAT GTA GCA TCA GCA CAC CCA ATA TCT GAT ACT 660
201 Gly Thr Ser Asn Tyr Ser Gln Gln Val Asp Val Ala Ser Ala Asp Pro Ile Ser Asp Thr 220

661 AAT GTG GCA TAT ACT TTA CAT TTT TAT GCA GCA TTT AAC CCG CAT GAT AAC TTA AGA AAT 720
221 Asn Val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn 240

721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780
241 Val Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260






» ===5=====Z=== =S===== - 
= ~====5r5==; ===== :== ... 

» ========s==; ====== -- .... ■ 

9 Ala Ala Met ciu Thr Ala Gin Ala 360 

1081 GGA GAT GAA ATT an ~ 

■= ========z=== 5 = == == == - 

" =============5= ==== ; - 

- ™ ~ ™ ~ ™ ~ z ~ ~ z ztzz z zzzzzz 

"21 TCT AAA GOT ATT CTT CTT GAC AAT TCT a*t ^ 

- - «, x, w leu A9P « « - « aT „ eo . 

ly Ser Lya Leu Lye Asn Lau val Val His 4S0 

- ==s====r=z ====== = === , :: 
Figure 17b (continued)
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X .« r IT f ^ WC *" "* ^ m *» "» «* «C TCA GAT 1580 

Thr ». XX. Arg c,, Wal phe s „ au 0ly Ue Sm « 1«. 

«! Ml r r aim,MKl ^ ^ m ^ IAC ™ « ™* AAT GTT GAT 1,40 

5.1 Al. Ph. n. Asp Leu Lys Gly Ala Tyr Gly Ph. v.l Tyr Arg Asn Thr Ph. A .„ VaX Asp st0 

17.1 GGT TCT GAA GTA ATA AAT ACT GGA GTA GAC TTT TTA GAT AGA GGT ACA GGA TTT AAT ACA 18 0 0 

NX Gly S.r Glu Val XX. A.„ Thr Gly VaX Asp Ph. L.u Asp ^ Gly Thr Wy £ £ « »" 

1101 GGT TTT AGA AAT GCA ATA TTT GAA AAT ACA TAT AAC CTT GGC AGT AGA GCT TCA GAA ATT 1860 
<0X Gly Ph. Arg Asn Ala XI. Phe G Xu Asn Thr Tyr Asn L.u Gly s.r Arg Ala sTr 2 l" 



S20 



S.r Thr Ala Arg Ly. Ly, Gin Gly S.r Pro Glu Gin Thr Hi. Val Tr, Asp Asn II. Arg 640 

1M1 AAC CCT AAT TCT GTT GAT TTT CCA ATA AGT GAT GGT ACA GAA AAT CTA GTA AAT AAA TTC 1 SS 0 

641 Asn Pro Asn S.r Val Asp Ph. Pro IX. s.r Asp Gly Thr Glu Asn Leu Val Asn Lys Ph. 660 

1M1 TGC CCA GAT TGG AAT ATA GAA CCA TGT AAT CCT GTA GAC GAA ACC AAC CAA GCA CCT ACA 2040 

•M Cys Pro Asp Trp Asn IX. GXu Pro Cy. Asn Pro Val Asp Glu Thr Asn Gin Ala Pro Thr 680 

2041 ATA AGC TTC CTA TCT CCT GTT AAC AAT ATT ACT TTA GTT GAA GGT TAT AAT TTA CAA GTT 2100 

«31 II. s.r Ph. L.u s.r Pro Val Asn Asn II. Thr L.u Val Glu Gly Tyr Asn L.u Gin Val 700 

«0l GAA GTT AAT GCT ACT GAT GCA GAT GGA ACT ATT GAT AAT GTA AAA CTT TAT ATA GAT AAC 2160 

Glu val Asn Al. Thr Asp Ala Asp Gly Thr U. Asp Asn Val Ly. L.u Tyr II. Asp Asn 720 

2161 AAT TTA GTT AGG CAA ATA AAT TCT ACT TCA TAT AAA TGG GGC CAT TCT GAT TCT CCA AAT 2220 

721 A.n Leu Val Arg Gin II. Asn S.r Thr S.r Tyr ly. Trp Gly Hi, s.r Asp Ser Pro Asn 7*0 

2221 ACA GAT GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC TTA AAA GCA ATT GCA ACT GAT 22B0 

Thr Asp Glu Leu Asn Gly Leu Thr Glu Gly Thr Tyr Thr Leu Lys Al. XI. AXa Thr Asp 760 

»« AAC GAC GGG GCT TCT ACA GAA ACG CAA TTT ACG TTA ACT GTA ATA ACA GAA CAA AGT CCG 2340 

Asn Asp GXy Al. s.r Thr GXu Thr Gin Ph. Thr Lsu Thr Val He Thr Glu Gin s.r Pro 700 

"« T T * M ^ "* ^ ^ ^ T " ACT °° T ™ «T TTT GAC ATT AAA 2400 

S.r Glu Asn cys Asp Ph. A.n Thr Pro S.r S.r Thr Gly L.u Glu Asp Ph. Asp II. Ly. 800 

2401 AAG TTT TCT AAC GTT TTT GAG TTA GGA TCT GGC GGA CCA TCT TTA AGT AAT TTA AAA ACA 2460

Figure 17a (continued)
801 Lys Phe Ser Asn Val Phe Glu Leu Gly Ser Gly Gly Pro Ser Leu Ser Asn Leu Lys Thr 820

Lys Thr Asn Asn Phe Thr Ile Tyr
Figure 17a (continued)
Figure No. 18a Pyrococcus furiosus VC1 (7EG1)

leader sequence: amino acids 1-24 

9 18 27 36 45 54

5' ATG AGC AAG AAA AAG TTC GTC ATC GTA TCT ATC TTA ACA ATC CTT TTA GTA CAG
Met Ser Lys Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu Leu Val Gln

63 72 81 90 99 108

GCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAC AAG TCA ACT TCA AAT
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn

117 126 135 144 153 162

ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT ACC AAG GTT CTC AAG ATT
Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val Leu Lys Ile

171 180 189 198 207 216

AGA TAC CCT GAT GAC GGT GAG TGG CCA GGA GCT CCT ATT GAT AAG GAT GGT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp

225 234 243 252 261 270

GGG AAC CCA GAA TTC TAC ATT GAA ATA AAC CTA TGG AAC ATT CTT AAT GCT ACT
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr

279 288 297 306 315 324

GGA TTT GCT GAG ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC CAA
Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln

333 342 351 360 369 378

CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA ACT AAT TGG GTG CAT GGA TAC CCC
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

387 396 405 414 423 432

GAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

441 450 459 468 477 486

ATA CCA TTA CCC AGT AAA GTT TCA AAC CTA ACA GAC TTC TAT CTA ACA ATC TCC
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
Ile Glu Ser Trp

549 ... 567 ...

ACG ACA GAA CCT TGG AGA ACA ACA CCA ATT ... AGC GAT GAG CAA GAA GTA
... Glu Pro Trp ... Ile ... Ser Asp Glu Gln Glu Val

... 621 ...

ATG ATA TGG ATT ...